In statistics and statistical physics, the Metropolis–Hastings algorithm is a Markov chain Monte Carlo (MCMC) method for obtaining a sequence of random samples from a probability distribution from which direct sampling is difficult.
This sequence can be used to approximate the distribution (e.g. to generate a histogram) or to compute an integral (e.g. an expected value).
Metropolis–Hastings and other MCMC algorithms are generally used for sampling from multi-dimensional distributions, especially when the number of dimensions is high.
Metropolis Algorithm
For the purpose of illustration, you will implement a special case of the Metropolis–Hastings algorithm where the proposal function (which candidate to choose next) is symmetric (all candidates with the same distance to the current value are equally likely).
You are given a function `f` from which we want to draw `N` samples to approximate it. `f` should accept a single parameter `x` and return the function's value for `x`, that is `f(x)`.

1. Initialization: Choose an arbitrary point `x_0` to be the first observation in the list of samples, and choose the normal (Gaussian) distribution as a proposal function that suggests a candidate for the next sample value `x_next`, given the previous sample value `x_t`. The Gaussian distribution is centered at the current sample `x_t`, so that points closer to `x_t` are more likely to be visited next, making the sequence of samples into a random walk.

2. For each iteration `t`:
   - Generate a candidate `x_next` for the next sample by drawing a random sample from the proposal distribution that is centered at the current sample `x_t`.
   - Calculate the acceptance ratio `α = f(x_next) / f(x_t)`, which will be used to decide whether to accept or reject the candidate (the ratio is the probability to move to `x_next`).
   - Accept or reject:
     - Generate a uniform random number `u` drawn from the interval [0, 1].
     - If `u ≤ α`, accept the candidate `x_next` as the next sample.
     - If `u > α`, reject the candidate and stay at `x_t`, that is `x_next = x_t`.
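The steps above can be sketched as follows. This is only one possible implementation, not the official solution; the signature matches the template described in the Task section below, and the default value for `n_samples` is an assumption for illustration.

```python
import numpy as np


def metropolis_hastings(f, x_0=0, n_samples=1000):
    """Sketch of the Metropolis algorithm with a N(x_t, 1) proposal."""
    samples = np.empty(n_samples)
    x_t = x_0
    samples[0] = x_t  # step 1: x_0 is the first observation
    for t in range(1, n_samples):
        # candidate drawn from the proposal distribution centered at x_t
        x_next = np.random.normal(loc=x_t, scale=1.0)
        # acceptance ratio; f does not need to be normalized
        alpha = f(x_next) / f(x_t)
        # accept with probability min(1, alpha), otherwise stay at x_t
        if np.random.uniform(0, 1) <= alpha:
            x_t = x_next
        samples[t] = x_t
    return samples
```

Note that rejected candidates still produce an entry in the sample list (the current value is repeated), which is what makes the accepted frequencies approximate `f`.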
Task
Your task in this exercise is to implement the Metropolis-Hastings algorithm.
Your function will be used to approximate the mean and standard deviation values for a range of common univariate distributions like the normal distribution.
You are provided with a template for the function `metropolis_hastings`. You have to implement the above-mentioned algorithm in Python.

The function accepts an arbitrary probability density `f` (will be provided, see the tests), a start value `x_0`, and the number of samples `n_samples` to draw from `f`.

Step one of the algorithm is already done by passing `x_0` to the function (or setting it to zero by default). `x_0` is your first sample, and from there you will move either to some point in the neighborhood or stay at `x_0`.

For this bite, you use a normal distribution as the proposal distribution, with a standard deviation of one and a mean equal to the current sample (`x_0` at the beginning and `x_t` afterwards).

Refer to the tests to see how the function is called and how the statistics of `f` are tested.

Example
To test your function, you could try to guess the mean and standard deviation of a normal distribution like in the following example.

```python
import numpy as np

def norm_dist(x, mean, std):
    """Gaussian normal probability distribution."""
    return np.exp(-0.5 * (x - mean) ** 2 / std ** 2)

samples = metropolis_hastings(f=lambda x: norm_dist(x, mean=0, std=1), x_0=-1, n_samples=100)
print(samples.mean(), samples.std())
```

If your implementation is right, this should return an approximation of the true mean (0) and standard deviation (1).
Hints

`numpy` will come in handy in this exercise, as you can use it to sample both from a normal distribution and from a uniform distribution. See `numpy.random` for further details.

If you want to take things up a notch, you should look into `scipy.stats`, the module of scipy that contains a large number of probability distributions, summary and frequency statistics, correlation functions, and statistical tests. For example, `scipy.stats` has a `norm` object that can be used to draw samples from a normal distribution (`norm.rvs(size=100)`), to calculate the probability density at an x-value (`norm.pdf(x)`), and to calculate the cumulative probability of an x-value (`norm.cdf(x)`). As I said, `scipy.stats` is optional and just for the curious; you can solve this bite with `numpy` alone.

Happy coding!
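To make the `numpy.random` hint concrete, the two random draws the algorithm needs map directly onto two calls (a minimal sketch; the variable names and seed are just for illustration):

```python
import numpy as np

np.random.seed(42)  # only to make the example reproducible

x_t = 0.0
# candidate from the proposal distribution N(x_t, 1)
x_next = np.random.normal(loc=x_t, scale=1.0)
# uniform random number for the accept/reject step
u = np.random.uniform(0, 1)
```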
For more background on the Metropolis–Hastings algorithm check out this page.
Further Reading Material
If you are interested in the topic, I recommend Computational Statistics in Python with a lot of code to see the concepts in action!
Be aware that `scipy` is not available on the pybites platform, so you have to try it locally.